Alignment with Non-overlapping Inversions in O(n^3)-Time

Authors

  • Augusto F. Vellozo
  • Carlos Eduardo Rodrigues Alves
  • Alair Pereira do Lago
Abstract

Alignments of sequences are widely used for biological sequence comparisons. Usually only biological events such as mutations, insertions and deletions are modeled; other events, such as inversions, are not automatically detected by the usual alignment algorithms. Alignment with inversions has no known polynomial-time algorithm. A simplification of the problem that considers only non-overlapping inversions was proposed by Schöniger and Waterman [20] in 1992, together with a corresponding O(n^6) solution. An improvement to an algorithm with O(n^3 log n)-time complexity was announced in an extended abstract [1] and, in the present paper, we give an algorithm that solves this simplified problem in O(n^3)-time and O(n^2)-space in the more general framework of an edit graph. Inversions have recently [4, 7, 13, 17] been found to be very important in comparative genomics, and Scherer et al. in 2005 [11] experimentally verified inversions that were found to be polymorphic in the human genome. Moreover, 10% of the 1,576 putative inversions reported overlap RefSeq genes in the human genome. We believe our new algorithms may open the possibility of more detailed studies of inversions in DNA sequences using exact optimization algorithms, and we hope this may be particularly interesting if applied to regions around known rearrangement boundaries. Scherer et al. report 29 such cases and prioritize them as candidates for biological and evolutionary studies.
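
To make the simplified problem concrete, the following is a minimal sketch of a naive dynamic program for alignment with non-overlapping inversions. It is not the algorithm of this paper: it recomputes every inverted block from scratch and is only practical for very short sequences. The function names, the DNA alphabet, and the scoring values (match +1, mismatch -1, gap -1, inversion penalty -2) are assumptions chosen for illustration.

```python
# Illustrative sketch only: names and scoring values are assumptions,
# and this brute-force recurrence is far slower than an O(n^3) method.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def global_score(a, b, match=1, mismatch=-1, gap=-1):
    """Standard global alignment score of a and b (no inversions)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap] + [0] * len(b)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(prev[j - 1] + sub, prev[j] + gap, cur[j - 1] + gap)
        prev = cur
    return prev[len(b)]


def align_with_inversions(s, t, match=1, mismatch=-1, gap=-1, inv=-2):
    """Best score when non-overlapping inverted blocks are allowed.

    best[i][j] is the best score for s[:i] against t[:j]. An inversion
    ending at (i, j) aligns the reverse complement of s[k:i] with t[l:j]
    and continues from best[k][l], so blocks never overlap.
    """
    n, m = len(s), len(t)
    best = [[float("-inf")] * (m + 1) for _ in range(n + 1)]
    best[0][0] = 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            cands = []
            if i > 0 and j > 0:
                sub = match if s[i - 1] == t[j - 1] else mismatch
                cands.append(best[i - 1][j - 1] + sub)
            if i > 0:
                cands.append(best[i - 1][j] + gap)
            if j > 0:
                cands.append(best[i][j - 1] + gap)
            for k in range(i):  # inverted block s[k:i] is non-empty
                inverted = reverse_complement(s[k:i])
                for l in range(j + 1):
                    z = global_score(inverted, t[l:j], match, mismatch, gap)
                    cands.append(best[k][l] + inv + z)
            best[i][j] = max(cands)
    return best[n][m]


s = "CCCCAAGCCCC"
t = "CCCCCTTCCCC"  # s with the block "AAG" replaced by its reverse complement
print(global_score(s, t))           # 5
print(align_with_inversions(s, t))  # 9
```

In this example the plain alignment has to pay for three mismatches, while the inversion-aware score pays the inversion penalty once and then matches every base. Each of the O(n^2 m^2) candidate blocks is realigned independently here, which is the source of the high cost that faster algorithms avoid.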

Related articles

An O(n^4) algorithm for alignment with non-overlapping inversions

Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, so some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many import...

A sparse dynamic programming algorithm for alignment with non-overlapping inversions

Alignment of sequences is widely used for biological sequence comparisons, and only biological events like mutations, insertions and deletions are considered. Other biological events like inversions are not automatically detected by the usual alignment algorithms, so some alternative approaches have been tried in order to include inversions or other kinds of rearrangements. Despite many import...

Alignments with Non-overlapping Moves, Inversions and Tandem Duplications in O(n^4) Time

Sequence alignment is a central problem in bioinformatics. The classical dynamic programming algorithm aligns two sequences by optimizing over possible insertions, deletions and substitutions. However, other evolutionary events can be observed, such as inversions, tandem duplications or moves (transpositions). It has been established that the extension of the problem to move operations is NP-com...

Efficient string-matching allowing for non-overlapping inversions

Inversions are a class of chromosomal mutations, widely regarded as one of the major mechanisms for reorganizing the genome. In this paper we present a new algorithm for the approximate string matching problem allowing for non-overlapping inversions which runs in O(nm) worst-case time and O(m^2) space, for a character sequence of size n and pattern of size m. This improves upon a previous O(nm^2)...

Efficient Matching of Biological Sequences Allowing for Non-overlapping Inversions

Inversions are a class of chromosomal mutations, widely regarded as one of the major mechanisms for reorganizing the genome. In this paper we present a new algorithm for the approximate string matching problem allowing for non-overlapping inversions which runs in O(nm) worst-case time and O(m^2)-space, for a character sequence of size n and pattern of size m. This improves upon a previous O(nm^2)-t...

Publication date: 2006